SIDR: Efficient Structure-Aware Intelligent Data Routing in SciHadoop

Authors

  • Joe Buck
  • Noah Watkins
  • Greg Levin
  • Adam Crume
  • Kleoni Ioannidou
  • Scott Brandt
  • Carlos Maltzahn
  • Neoklis Polyzotis
Abstract

MapReduce is a popular framework for distributed, parallel computation. It is increasingly used in domains quite different from the web applications for which it was designed, including the processing of big structured data such as scientific and financial data. Previous work on using MapReduce to process scientific data did not incorporate knowledge of this structure in its internal communications. We show that performance gains can be realized by leveraging knowledge of the structure of the data to minimize and localize communications between nodes, guarantee workload balance across processing nodes, ensure that Reduce tasks start as soon as possible, and create balanced, contiguous output. We implemented these improvements in SciHadoop, a version of the open-source Hadoop MapReduce framework designed for structured scientific data. Our results show reductions in total query execution time of up to 29% over SciHadoop, with initial results available when only 6% of the query has completed. The resulting output is also more efficiently organized, compared to Hadoop.
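
To make the idea of structure-aware routing concrete, the sketch below contrasts a generic hash partitioner with one that uses the known shape of an array to assign intermediate keys to Reduce tasks. It is only an illustration under stated assumptions, not the SIDR algorithm from the paper: the function names (`linearize`, `contiguous_partition`, `hash_partition`) are hypothetical, keys are assumed to be n-dimensional array coordinates, and the array is assumed to be laid out in row-major order.

```python
from typing import Sequence


def linearize(coord: Sequence[int], shape: Sequence[int]) -> int:
    """Return the row-major offset of an n-dimensional coordinate."""
    if len(coord) != len(shape):
        raise ValueError("coordinate and shape must have the same rank")
    offset = 0
    for c, extent in zip(coord, shape):
        if not 0 <= c < extent:
            raise ValueError(f"coordinate {c} outside extent {extent}")
        offset = offset * extent + c
    return offset


def contiguous_partition(coord: Sequence[int], shape: Sequence[int],
                         num_reducers: int) -> int:
    """Hypothetical structure-aware partitioner.

    Splits the row-major key space into num_reducers contiguous,
    near-equal ranges, so each reducer owns one contiguous slab.
    """
    total = 1
    for extent in shape:
        total *= extent
    return linearize(coord, shape) * num_reducers // total


def hash_partition(coord: Sequence[int], num_reducers: int) -> int:
    """Structure-agnostic baseline: neighbouring keys scatter arbitrarily."""
    return hash(tuple(coord)) % num_reducers


if __name__ == "__main__":
    shape = (4, 6)  # a small 2-D array with 24 cells
    for row in range(shape[0]):
        print([contiguous_partition((row, col), shape, 3)
               for col in range(shape[1])])
```

Because the reducer index never decreases as the row-major offset increases, each Reduce task in this sketch receives one contiguous block of keys and the blocks differ in size by at most one cell. That is the kind of balanced, contiguous output the abstract describes, though the paper's actual mechanism may differ.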

Similar resources

EEQR: An Energy Efficient Query-Based Routing Protocol for Wireless Sensor Networks

Routing in Wireless Sensor Networks (WSNs) is a very challenging task due to the large number of nodes, their mobility, and the lack of proper infrastructure. Because the sensors are battery-powered devices, energy efficiency is considered one of the main factors in designing routing protocols for WSNs. Most energy-aware routing protocols are mere energy savers that attempt to decrease the energy...

TAC: A Topology-Aware Chord-based Peer-to-Peer Network

Among structured Peer-to-Peer systems, Chord is widely popular for salient features such as simplicity, high scalability, small path length with respect to network size, and flexibility on node join and departure. However, Chord doesn’t take into account the topology of the underlying physical network when a new node is being added to the system, thus resulting in high routing late...

Evolutionary Computing Assisted Wireless Sensor Network Mining for QoS-Centric and Energy-efficient Routing Protocol

The exponential rise in wireless communication demands and allied applications has spurred academia and industry to develop more efficient routing protocols. Because a Wireless Sensor Network (WSN) is a battery-operated network, it often undergoes node death-causing pre-ma...

Link-stability and Energy Aware Multipath Routing in MANET

Energy conservation is important for mobile ad hoc networks, where devices are expected to work for long periods without recharging their batteries. Therefore, there is a need for an intelligent routing protocol that can minimize overhead and ensure the use of minimum-energy routes. In Progressive Energy Efficient Routing, energy efficient shortest paths are selected with mini...

Publication date: 2012